Abstract

Mathematical models have been central to ecology for nearly a century. Simple models of population 
dynamics have allowed us to understand fundamental aspects underlying the dynamics and stability of 
ecological systems. What has remained a challenge, however, is to meaningfully interpret experimental or 
observational data in light of mathematical models. Here, we review recent developments, notably in the 
growing field of approximate Bayesian computation (ABC), that allow us to calibrate mathematical 
models against available data. Estimating the population demographic parameters from data remains a 
formidable statistical challenge. Here, we attempt to give a flavor and overview of ABC and its applications 
in population biology and ecology and eschew a detailed technical discussion in favor of a general 
discussion of the advantages and potential pitfalls this framework offers to population biologists. 



Introduction 

Theoretical population biology has been crucial for our 
understanding of ecosystems [1]. Mathematical models 
can explain elegantly what might appear as bewilderingly 
complex variations in species abundances. Seminal work 
starting in the early 20th century [2-4] has, in fact, become 
so familiar to population biologists and beyond that today 
we are hardly surprised to see complex oscillatory patterns 
or complex dependencies of population dynamics on a 
myriad of environmental and demographic factors [5]. 
Many of these phenomena can straightforwardly be 
explained in terms of relatively simple population 
dynamics models. The success of these models has also 
meant that ecological ideas are coming to pervade the 
analysis of other interacting systems, including cancer [6], 
stem cells [7,8], and even the banking system [9,10], all of 
which are characterized by the interactions between 
different entities that affect the overall dynamics of the 
system and its stability. 

Simple models are beguiling and shape our intuition and 
allow us to explain trends in data. In many important 
scenarios, however, different factors come together with 
sometimes complex patterns resulting from their interplay. 
Thus, understanding realistic systems, which are subject to a 
multitude of internal and external factors, is hard 
[11,12]. This is further complicated in situations where 
models are used to make predictions or assess different 
types of interventions in silico prior to their implementation 
in, for example, conservation biology. 

These challenges are not unique to theoretical ecology, of 
course, and recent years have seen concerted efforts to 
tackle the so-called inverse problem: estimating para- 
meters of a model from data [13]; choosing from among 
a set of plausible candidate models the model that is best 
able to explain the data [14]; or inferring mechanistic or 
statistical dependencies between the different state 
variables making up a system — in an ecological case, 
this would, for example, be the species considered in the 
model. Below, we consider population dynamical 
models in which a vector, x, with one entry for each of N species, 
describes the abundances of the species in the 
ecosystem. These are assumed to change as a result of 
interactions among the species (and potentially external 
factors) according to some rate laws, 

$$\frac{dx}{dt} = f(x;\theta), \tag{1}$$



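where, with slight abuse of notation, we will also 
implicitly allow for stochastic dynamics. The community 
matrix of the ecological system (1) is, of course, given by 
the Jacobian of f, conventionally evaluated at an equilibrium $x^*$, 

$$A_{ij} = \left.\frac{\partial f_i(x;\theta)}{\partial x_j}\right|_{x = x^*},$$

which captures the ecological relationships among the 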
species. Finally, the (vector-valued) parameter $\theta$ denotes 
the typically unknown demographic and system parameters 
(for example, birth, death, and migration rates) as 
well as parameters characterizing the interactions 
between and within species. 
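
To make the notation concrete, the following minimal Python sketch implements one possible rate law f(x; θ): a two-species Lotka-Volterra predator-prey model. This model, the function names (`lotka_volterra`, `simulate`, `community_matrix`), the fixed-step Runge-Kutta integrator, and all parameter values are illustrative assumptions rather than anything taken from the work reviewed here. The community matrix is approximated as the finite-difference Jacobian of f at an equilibrium.

```python
def lotka_volterra(x, theta):
    """Hypothetical rate law f(x; theta) for prey and predator abundances."""
    prey, predator = x
    birth, predation, conversion, death = theta
    return [
        birth * prey - predation * prey * predator,
        conversion * prey * predator - death * predator,
    ]


def simulate(f, x0, theta, dt=0.01, steps=1000, record_every=100):
    """Integrate dx/dt = f(x; theta) with a fixed-step fourth-order Runge-Kutta scheme."""
    x = list(x0)
    trajectory = [list(x)]
    for step in range(1, steps + 1):
        k1 = f(x, theta)
        k2 = f([xi + 0.5 * dt * k for xi, k in zip(x, k1)], theta)
        k3 = f([xi + 0.5 * dt * k for xi, k in zip(x, k2)], theta)
        k4 = f([xi + dt * k for xi, k in zip(x, k3)], theta)
        x = [xi + dt * (a + 2 * b + 2 * c + d) / 6.0
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        if step % record_every == 0:
            trajectory.append(list(x))
    return trajectory


def community_matrix(f, x, theta, h=1e-6):
    """Central-difference approximation of A[i][j] = d f_i / d x_j at the point x."""
    n = len(x)
    A = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up, down = list(x), list(x)
        up[j] += h
        down[j] -= h
        f_up, f_down = f(up, theta), f(down, theta)
        for i in range(n):
            A[i][j] = (f_up[i] - f_down[i]) / (2.0 * h)
    return A


# Assumed parameter values: (birth, predation, conversion, death).
theta = (1.0, 0.1, 0.075, 1.5)
# Non-trivial equilibrium: prey = death / conversion, predator = birth / predation.
equilibrium = [20.0, 10.0]
A = community_matrix(lotka_volterra, equilibrium, theta)
trajectory = simulate(lotka_volterra, [10.0, 5.0], theta)
```

The off-diagonal entries of `A` have opposite signs here, which is the signature of a predator-prey relationship; in an inference setting, `theta` would be the unknown quantity to be estimated from data.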

Below, we will discuss methods that allow us to infer the 
parameters, $\theta$, and choose between different potential 
models (for example, $f_1, f_2, \ldots, f_k$). The statistical toolset 
that we will discuss, centered primarily around ABC 
[15,16], complements traditional mathematical 
approaches that have been used in theoretical population 
biology to great effect since the 1950s. The aim here, 
however, is not to focus on general mathematical laws 
governing the behavior and fate of natural populations but 
to make models as specific as possible to a given problem, to identify 
the key factors driving an ecosystem's dynamics, or to 
make predictions about the future of an ecosystem. 

There are well-defined statistical frameworks to deal with 
parameter inference. Model selection — the process of 
comparing the ability of different models to explain 
some data — is continuing to attract the attention of 
statisticians and domain experts in different scientific 
disciplines [14]. But for many challenging real-world 
problems, conventional statistical approaches become 
computationally too cumbersome very quickly. This 
class of problems includes many stochastic processes, 
highly structured populations, and those where different 
types of data need to be considered. Often, it is still 
straightforward to establish simulation models — in gen- 
eral, real-world problems tend to defy purely analytical 
approaches — but conventional statistical approaches 
become computationally too expensive. 

Arguably, many of the most contentious problems in 
population biology (or science in general) fall into this 
category of problems. A model abstracts from reality 
what are known or believed to be the essential features of 
a real natural (or technological or social) system. This 
fact alone has in the past added to some controversies: as 
"all models are wrong" [17], it is necessary to identify the 
best model that captures and allows us to quantitatively 
and qualitatively understand the dynamics of the real 
system. Thus, we need statistical tools that allow us to 
deal with complex systems, many of which are expected 
to stretch conventional statistics. Here, we develop a 
viable alternative that maintains most if not all of the 
advantages of the Bayesian inferential apparatus but can 
be extended to problems defying conventional statistics. 

Model calibration and parameter estimation 

Given a model, $f(x;\theta)$, and some data, $D = \{d_1, \ldots, d_n\}$, 
we need to infer the parameters, $\theta$, from the data. The 
likelihood [18] is defined as the probability of obtaining 
some data D given a parameter value $\theta$, 

$$L(\theta) = \Pr(D \mid \theta). \tag{2}$$



This is the central quantity in likelihood inference; 
crucially, the likelihood contains all the information 
about the parameter that can be extracted from the data 
D. In Bayesian inference [19], the likelihood, together with the prior 
distribution of $\theta$, $\Pr(\theta)$, strikes a balance between what 
is or can be known about the parameter prior to having 
seen the data, and the information contained in the data, 
to give rise to the posterior distribution, 

$$\Pr(\theta \mid D) = \frac{\Pr(D \mid \theta)\,\Pr(\theta)}{\Pr(D)}. \tag{3}$$



Here, Pr(D) denotes the evidence. It is often thought of 
as a normalization constant but does in fact contain 
information about the ability of a model to describe the 
data. 

Obtaining the posterior distribution, or a sample from 
it, is computationally demanding. In general, compu- 
ting the evidence Pr(D), which is typically a high- 
dimensional integral, is complicated. Sometimes, the 
focus, therefore, may shift from consideration of the 
whole (posterior) distribution to the maximum (mode) 
of the posterior distribution; this maximum a posteriori 
estimate is the Bayesian equivalent to the maximum 
likelihood estimate. 
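
For reference, the two quantities discussed in this paragraph can be written out explicitly. This is standard Bayesian notation, added here as a reminder rather than as part of the original derivation:

$$\Pr(D) = \int \Pr(D \mid \theta)\,\Pr(\theta)\,d\theta, \qquad \hat{\theta}_{\mathrm{MAP}} = \arg\max_{\theta}\, \Pr(D \mid \theta)\,\Pr(\theta).$$

The integral runs over the whole parameter space, which is why it becomes expensive as the number of parameters grows. The maximum a posteriori estimate does not require Pr(D) at all, and for a flat prior it coincides with the maximum likelihood estimate.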

So that the additional information contained in the 
distribution can be obtained, a wealth of computational 
statistical approaches have been developed. Markov 
chain Monte Carlo (MCMC) methods have become the 
main workhorses of computational Bayesian statistics 
and have allowed us to generate samples from the 
posterior distribution. Recent years have witnessed 
increased interest in these and related methods — such 
as population and sequential Monte Carlo techniques — 
but even the most sophisticated approaches reach their 
limits when the number of parameters or the complexity 
of the model increases. The first problem, the so-called 
curse of dimensionality, is shared by all statistical 
inference procedures. 

The second problem is more interesting. For example, 
we may ask ourselves whether there are simpler 
versions, $M_a(\theta_a)$, of the model that we are considering, 
$M_0(\theta)$, that would, despite the simplification or coarse-graining 
(by simplification we typically mean that the 
dimension of the parameter vector is smaller in the 
simplified model, that is, $|\theta_a| < |\theta|$), allow us to draw 
meaningful, verifiable (or falsifiable) mechanistic 
insights from the available data. In principle, this 
might appear to exacerbate the statistical problem, for 
we would have to find computationally affordable and 
sufficiently discriminatory ways of deciding if and when 
a simpler model $M_a(\theta_a)$ is a good approximation to the 
original model, $M_0(\theta)$. We will return to this point 
again below. First, however, we discuss ABC methods, 
which form an alternative approach to tackling statis- 
tically challenging problems in a Bayesian framework 
and which have become a popular alternative to 
conventional (or exact) Bayesian inference in many 
applications, especially in evolutionary, population, 
and systems biology. 

Approximate Bayesian computation 

In ABC, we stay as close as possible [16] to the model of 
interest but instead forgo evaluation of the likelihood in 
favor of a comparison between simulated and real data 
[15,20,21]. For many systems, the likelihood becomes 
computationally intractable, either because of the model 
complexity or the detailed nature of the data. Nevertheless, 
the underlying model can still be simulated. The principal 
underlying insight of ABC is that we can consider 

$$\Pr(D \mid \theta) = \lim_{\varepsilon \to 0} \Pr(\Delta[D, D_\theta] < \varepsilon \mid \theta), \tag{4}$$

where $D_\theta$ is data obtained by simulating from our model 
with parameter $\theta$, $\Delta[x,x']$ is a distance function that can be 
chosen flexibly to suit the problem at hand, and $\varepsilon$ is a 
tolerance threshold that reflects the desired accuracy of our 
inference. The essential problem is that for any complicated 
problem, it is impossible to obtain the precise 
dataset, D, by simulating data, $D_\theta$, from the model, even if we 
know the true parameter (we ignore the artificial problem 
of deterministic dynamics with no observational noise). 
By increasing the threshold $\varepsilon$, our inference becomes more 
approximate, but the chance of obtaining a simulated 
dataset for which $\Delta[D, D_\theta] < \varepsilon$ increases. 

The comparison of real and simulated data is particularly 
straightforward for ecological time-series data, for 
example [22]. Here, D might take the form of vectors 
of population abundances, $x_t$, for n species collected at 
$t = 1, \ldots, m$ time points. In this case, the Euclidean (or any 
other vector) norm provides a suitable distance. The 
analysis of dynamical systems is thus relatively straightforward 
in an ABC framework. Generalization to 
compartmental or spatio-temporal models (or both) is 
straightforward [23]: if we can simulate data efficiently, 
we can appeal to the Bayesian inference formalism via 
ABC (keeping in mind the nature of the approximation 
and the tolerance threshold $\varepsilon$). 
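
As a minimal sketch of such a distance, the following Python function computes the Euclidean norm of the difference between an observed and a simulated time series. The function name and the data layout (a list of m time points, each holding n abundances) are assumptions made for illustration only.

```python
import math


def euclidean_distance(observed, simulated):
    """Euclidean distance between two time series of species abundances.

    Each argument is a list of m time points; each time point is a list of
    n abundances (one per species). The layout is an illustrative assumption.
    """
    if len(observed) != len(simulated):
        raise ValueError("time series must have the same number of time points")
    total = 0.0
    for obs_t, sim_t in zip(observed, simulated):
        if len(obs_t) != len(sim_t):
            raise ValueError("time points must contain the same number of species")
        total += sum((o - s) ** 2 for o, s in zip(obs_t, sim_t))
    return math.sqrt(total)


# Two species observed at two time points.
example = euclidean_distance([[0.0, 0.0], [0.0, 0.0]], [[3.0, 0.0], [0.0, 4.0]])
```

In practice, abundances of different species can differ by orders of magnitude, so one may wish to rescale each species before computing the distance; that choice is left open here.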

Instead of comparing the data, we can compare aspects 
of the data, such as summary statistics. This has been one 
of the main advantages as well as sources of contention 
for ABC inference. We call a statistic, s, of the data 
sufficient if and only if 

$$\Pr(D \mid \theta, s) = \Pr(D \mid s).$$

In this case, we can replace the data by the sufficient 
statistic without any loss of information about the 
parameter, $\theta$. The attraction of using sufficient summary 
statistics lies in the fact that their dimension, $d_s$, is typically 
much smaller than the dimension of the data itself, $d_D$ (in 
the above example of n species sampled at m time points, 
$d_D = m \times n$); that is, $d_s \ll d_D$. Especially in population 
genetics, which has inspired the rise of ABC methods since 
the late 1990s, the use of summary statistics has been 
popular (see, for example, [24-28]). With the use of 
summary statistics, the likelihood can be written as 

$$\Pr(D \mid \theta) = \lim_{\varepsilon \to 0} \Pr(\Delta[s(D), s(D_\theta)] < \varepsilon \mid \theta), \tag{5}$$

with potentially an appropriate change in the distance 
function and $\varepsilon$. 
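
A textbook illustration of sufficiency (not an example drawn from the studies cited here) may help. Suppose the data are n independent Poisson counts, $d_1, \ldots, d_n$, with common mean $\theta$, and let s be their sum. Then

$$\Pr(D \mid \theta) = \prod_{i=1}^{n} \frac{\theta^{d_i} e^{-\theta}}{d_i!} = \frac{\theta^{s} e^{-n\theta}}{\prod_{i} d_i!}, \qquad s = \sum_{i=1}^{n} d_i,$$

and, because s itself is Poisson distributed with mean $n\theta$,

$$\Pr(D \mid \theta, s) = \frac{\Pr(D \mid \theta)}{\Pr(s \mid \theta)} = \frac{s!}{\prod_{i} d_i!}\, n^{-s} = \Pr(D \mid s).$$

The right-hand side no longer depends on $\theta$, so the single number s carries all the information about $\theta$ that the n counts contain.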

Although Equation 5 works very well for parameter 
inference if s is sufficient, it is important to note that 
sufficient statistics are few and far between for any real- 
world problem. Unfortunately, ABC requires appropriate 
sufficient statistics (or comparisons of the data directly, 
as in the case of time-series problems). There have been 
attempts to generate collections of statistics that together 
fulfil sufficiency properties [28-33], but these are 
computationally expensive in their own right. 

So far, we have implicitly considered ABC in a simple 
rejection framework: (a) we sample a parameter from a 
suitable prior, (b) we simulate the model for the 
parameter, and then (c) we compare the simulated and 
the real data (or their respective summary statistics) and 
accept a parameter as a draw from the ABC posterior if 
the distance is below some threshold. Steps (a) to (c) are 
repeated until a sufficiently large number of parameter 
values have been accepted. The posterior in this case is 
represented as a sum over indicator functions, 

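$$\Pr_{\mathrm{ABC}}(\theta \mid D) \propto \sum_{i=1}^{N} \mathbb{1}(\theta = \theta_i),$$

where, for each of the N accepted parameters $\theta_i$, either $\Delta[D, D_{\theta_i}] < \varepsilon$ or 
$\Delta[s(D), s(D_{\theta_i})] < \varepsilon$, depending on whether the data or 
sufficient summaries are used in the inference process. 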
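
Steps (a) to (c) can be sketched in a few lines of Python. The sampler below is a minimal illustration under stated assumptions: the function names, the toy Gaussian model with unknown mean, the uniform prior, and the tolerance are all hypothetical choices, and the sample mean is used as the summary statistic because it is sufficient for a Gaussian mean with known variance.

```python
import random


def abc_rejection(observed, simulate, sample_prior, distance,
                  epsilon, n_accept, rng, max_draws=1_000_000):
    """Basic ABC rejection sampler; returns the accepted parameter values."""
    accepted = []
    for _ in range(max_draws):
        theta = sample_prior(rng)                      # (a) draw from the prior
        simulated = simulate(theta, rng)               # (b) simulate the model
        if distance(observed, simulated) < epsilon:    # (c) compare and accept
            accepted.append(theta)
            if len(accepted) >= n_accept:
                break
    return accepted


# Toy problem (assumed for illustration): Gaussian data with unknown mean.
def simulate_data(theta, rng, n=50):
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def sample_mean(values):
    return sum(values) / len(values)


def mean_distance(a, b):
    """Distance between summary statistics (sample means) of two datasets."""
    return abs(sample_mean(a) - sample_mean(b))


rng = random.Random(42)
observed = simulate_data(2.0, rng)          # stand-in for real data
posterior = abc_rejection(
    observed, simulate_data, lambda r: r.uniform(-5.0, 5.0),
    mean_distance, epsilon=0.1, n_accept=200, rng=rng,
)
posterior_mean = sample_mean(posterior)
```

With a prior of width 10 and a tolerance of 0.1, only about one draw in fifty is accepted even in this one-parameter problem, which gives a sense of why the approach scales poorly.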

This framework is as simple as it is impractical: like all 
rejection samplers, it is limited to small problems 
involving fewer than a handful of parameters. It has been 
possible to construct ABC-MCMC samplers [21], but the 
real workhorses of most ABC approaches to real-world 
problems are based on sequential Monte Carlo (SMC) 
approaches [22,34]; ABC-SMC has become a very 
popular field of research (arguably inspiring more 
detailed analysis also in exact SMC samplers), and recent 
developments are allowing us to tackle larger and more 
complicated systems [35]. The most widely used flavor of 
ABC-SMC proceeds by constructing a set of intermediate 
distributions that start from the prior and increasingly 
resemble the posterior. To do so, a sequence of 
decreasing thresholds, $\varepsilon_1 > \varepsilon_2 > \ldots > \varepsilon_K$ (with $\varepsilon_K = \varepsilon$), is 
defined and the sequence of distributions is constructed 
by sampling parameter vectors from the previous 
distribution (or the prior in the first step), perturbing 
them by using some perturbation kernel function, and 
accepting those parameter vectors for which the distance 
between real and simulated data falls below the threshold 
$\varepsilon_k$. Choice of the thresholds and the nature of the 
perturbation kernels determine the computational efficiency 
and runtime of the inference, but both can be 
tuned to speed up the process and tackle larger problems 
[36,37]. 
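
The sequential scheme described above can be sketched as follows for a single scalar parameter. This is one common formulation with importance weights and a fixed Gaussian perturbation kernel, offered as an illustration under assumptions; the function names, the toy model, the threshold schedule, and the kernel width are hypothetical and are not taken from the cited implementations.

```python
import math
import random


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def abc_smc(observed, simulate, distance, prior_sample, prior_pdf,
            thresholds, n_particles, kernel_sd, rng):
    """ABC-SMC for a scalar parameter with a Gaussian perturbation kernel."""
    particles, weights = [], []
    for t, eps in enumerate(thresholds):
        new_particles, new_weights = [], []
        while len(new_particles) < n_particles:
            if t == 0:
                theta = prior_sample(rng)              # first population: prior
            else:
                parent = rng.choices(particles, weights=weights)[0]
                theta = rng.gauss(parent, kernel_sd)   # perturb a resampled particle
            if prior_pdf(theta) == 0.0:
                continue                               # outside the prior support
            if distance(observed, simulate(theta, rng)) >= eps:
                continue                               # too far from the data
            if t == 0:
                w = 1.0
            else:
                # Importance weight: prior density over the proposal density.
                proposal = sum(wj * normal_pdf(theta, pj, kernel_sd)
                               for pj, wj in zip(particles, weights))
                w = prior_pdf(theta) / proposal
            new_particles.append(theta)
            new_weights.append(w)
        total = sum(new_weights)
        particles = new_particles
        weights = [w / total for w in new_weights]
    return particles, weights


# Toy problem (assumed): the "data" is the mean of 50 Gaussian draws.
def simulate_mean(theta, rng, n=50):
    return sum(rng.gauss(theta, 1.0) for _ in range(n)) / n


rng = random.Random(7)
observed = simulate_mean(2.0, rng)
particles, weights = abc_smc(
    observed, simulate_mean, lambda a, b: abs(a - b),
    prior_sample=lambda r: r.uniform(-5.0, 5.0),
    prior_pdf=lambda th: 0.1 if -5.0 <= th <= 5.0 else 0.0,
    thresholds=[1.0, 0.5, 0.2, 0.1], n_particles=200, kernel_sd=0.5, rng=rng,
)
weighted_mean = sum(p * w for p, w in zip(particles, weights))
```

Because later populations are proposed from regions that already fit the data reasonably well, far fewer simulations are wasted at the final tolerance than in plain rejection sampling.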

Model selection and checking 

So far, we have assumed that we have a single model that 
describes our system of interest. The Bayesian framework 
readily provides us with credible intervals for parameters, 
but it is also possible to assign probabilities for different 
models to be the correct model, conditional on the 
available data and the set of competing models, 
$M_i \in \mathcal{M} = \{M_1, \ldots, M_k\}$. In the likelihood framework, the 
comparison of general (that is, non-nested) models is 
made possible only through the use of information 
criteria; in the Bayesian framework, the posterior 
probability of a model is given analogously to Equation 
3 by [14] 

$$\Pr(M_i \mid D) = \frac{\Pr(D \mid M_i)\,\Pr(M_i)}{\Pr(D)}, \tag{6}$$

where $\Pr(D \mid M_i)$ is also known as the marginal likelihood of model 
$M_i$. In principle, Bayesian model selection allows us to 
compare any number of arbitrary models. An additional 
advantage is that the selection via the marginal 
likelihood, Equation 6, automatically strikes a 
balance between the ability of models to reproduce or 
explain the observed data, the complexity of the model, 
and the robustness of the inference. 

Equation 6 can be interpreted in the ABC framework 
[22,38,39], and ABC model selection has been an area of 
great interest and activity [40-45]. Although model 
selection is indeed straightforward if experimental and 
simulated data are compared directly, it has been shown 
that model selection becomes unreliable when summary 
statistics instead of the data are compared [46,47]: 
summary statistics are sufficient for model selection for 
only a very restricted set of problems. Constructing sets 
of statistics that are sufficient for model selection, which 
must be sufficient for every model considered and across 
the models, is an area of active research [48,49]; while 
possible in principle, it is computationally enormously 
demanding. 
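
When simulated and observed time series are compared directly, model selection can be folded into the rejection scheme by treating the model index as one more quantity to be sampled. The Python sketch below illustrates the idea under assumptions: the two candidate growth models, their priors, the noise level, the tolerance, and all names are hypothetical, and the acceptance fractions are only crude estimates of the posterior model probabilities.

```python
import math
import random


def exponential_model(theta, times, x0=10.0):
    (r,) = theta
    return [x0 * math.exp(r * t) for t in times]


def logistic_model(theta, times, x0=10.0):
    r, K = theta
    return [K / (1.0 + (K / x0 - 1.0) * math.exp(-r * t)) for t in times]


def add_noise(series, sd, rng):
    """Add Gaussian observation noise to a deterministic trajectory."""
    return [x + rng.gauss(0.0, sd) for x in series]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Each candidate model is paired with a sampler for its (assumed) prior.
MODELS = [
    (exponential_model, lambda rng: (rng.uniform(0.0, 1.0),)),
    (logistic_model, lambda rng: (rng.uniform(0.0, 1.0), rng.uniform(50.0, 150.0))),
]


def abc_model_selection(observed, times, models, epsilon, n_draws, noise_sd, rng):
    """Count acceptances per model under a uniform prior over models."""
    counts = [0] * len(models)
    for _ in range(n_draws):
        m = rng.randrange(len(models))                 # sample a model index
        model, sample_prior = models[m]
        simulated = add_noise(model(sample_prior(rng), times), noise_sd, rng)
        if euclidean(observed, simulated) < epsilon:   # compare the data directly
            counts[m] += 1
    total = sum(counts)
    probabilities = [c / total for c in counts] if total else None
    return counts, probabilities


rng = random.Random(1)
times = list(range(10))
# Stand-in for real data: a noisy logistic trajectory.
observed = add_noise(logistic_model((0.5, 100.0), times), 2.0, rng)
counts, probabilities = abc_model_selection(
    observed, times, MODELS, epsilon=15.0, n_draws=20000, noise_sd=2.0, rng=rng)
```

In this toy setting the exponential model cannot reproduce the saturation in the data, so essentially all accepted simulations come from the logistic model.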

In many ecological problems, however, we deal with 
spatio-temporal time-series data, for which model 
selection is possible. Our aim in such cases is typically 
to identify the most promising mechanistic descriptions 
of a complex system. If no single model emerges from 
such a comparison, then we need to investigate those 
models that have comparably high marginal likelihoods. 
Simulations from the respective model posteriors can 
then be used, for example, to develop more discrimina- 
tory experimental designs that allow us to further 
distinguish among these models [50]. This, too, is an 
area of continuing importance for ABC. 

Applicability of approximate Bayesian 
computation: an outlook 

ABC methods were born out of a need to tackle problems 
that defy conventional statistical methodologies. It has 
become clear, however, that whenever suitable Bayesian 
alternatives that do deal with the proper likelihood are 
available, ABC becomes computationally too expensive. 
The reason for this is primarily the fact that the 
representation of the posterior (as a weighted sum over 
Dirac $\delta$ functions) is not very efficient. So when 
alternatives are available, they ought to be used. In parallel 
to their role in computationally demanding applications, 
ABC techniques have, more recently, also attracted 
attention as an inferential framework in their own right 
[16,51]. From this, interesting new approaches to deal 
with real-world problems may well emerge [52]. 

In conclusion, ABC-based methods are best suited to 
those problems for which other likelihood-based (or 
exact) Bayesian inference procedures do not yet exist. 
This appears to still include a host of challenging and 
interesting problems. Many stochastic and highly struc- 
tured spatio-temporal problems in ecology, epidemiol- 
ogy, and evolutionary genetics clearly fall into this 
category. The recent developments discussed above 
mean that ABC has become a viable new way of tackling 
computationally demanding parameter inference pro- 
blems. Given a model — as long as we can simulate it — 
ABC gives us a handle to evaluate approximate posterior 
distributions, which then can be further evaluated. 
Sensitivity and robustness analyses, but also predictions 
of future behavior or the likely effects of any interven- 
tions or perturbations, can be analyzed by simulating 
the model with parameters sampled from the posterior. 
There is enormous scope for basing the exploration of, 
for example, policy or conservation measures on the 
available data in this way. ABC has, for example, been 
used in experimental design [50,53] and in synthetic 
biology [14,54] to generate designs of molecular path- 
ways that exhibit certain types of behavior. In such 
cases, we replace the observed data, D, by a represen- 
tation of the desired behavior (such as the desired 
abundance of a species). Then the inference procedure is 
used to identify the scenario for which we are most 
likely to observe this outcome. Such predictions then 
reflect the best available evidence in light of the data 
and the model. 
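
Simulating from the posterior to assess an intervention can be sketched as follows. Everything in this Python example is an assumption made for illustration: the forward model (exponential growth with an optional harvest rate and log-normal noise), the stand-in list of posterior samples, and the function names.

```python
import math
import random


def project_abundance(r, rng, harvest=0.0, x0=10.0, horizon=5.0, noise_sd=0.1):
    """Hypothetical forward model: growth at net rate r - harvest with log-normal noise."""
    return x0 * math.exp((r - harvest) * horizon + rng.gauss(0.0, noise_sd))


def posterior_predictive(posterior_samples, simulate, n_sims, rng):
    """Simulate the model repeatedly with parameters drawn from the posterior sample."""
    return [simulate(rng.choice(posterior_samples), rng) for _ in range(n_sims)]


def quantile(values, q):
    """Simple empirical quantile (no interpolation)."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, int(q * len(ordered)))]


rng = random.Random(0)
# Stand-in for an ABC posterior sample of the growth rate r.
posterior_samples = [0.4 + 0.002 * i for i in range(101)]

scenarios = {
    "baseline": posterior_predictive(posterior_samples, project_abundance, 2000, rng),
    "harvested": posterior_predictive(
        posterior_samples,
        lambda r, g: project_abundance(r, g, harvest=0.2), 2000, rng),
}
# 95% predictive interval and median for each scenario.
summary = {name: (quantile(v, 0.025), quantile(v, 0.5), quantile(v, 0.975))
           for name, v in scenarios.items()}
```

The spread of each predictive interval reflects both the remaining uncertainty in the parameter and the stochasticity of the model, which is the information a decision maker would need when comparing interventions.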

As an aside, it is worth keeping in mind that the technical 
challenges of statistical inference and modeling can often 
be minor compared with the difficulties in communicat- 
ing the results to policy makers or the general public. Many 
of the most pressing problems in ecology have become 
highly emotive topics as they nearly always involve a 
conflict between parties that have very different priorities 
(see, for example, [45,55]). In many complicated situa- 
tions, the nuance and cautiousness that accompany how 
we present such analyses could be taken for wavering or 
lack of reliability. Here, however, ABC, with its explicit 
focus on simulation, may even have an advantage, as the 
underlying rationale is so straightforwardly explained and 
easy to understand. 

Abbreviations: ABC, approximate Bayesian computation; MCMC, Markov 
chain Monte Carlo; SMC, sequential Monte Carlo. 